Oversampling method for intrusion detection based on clustering and instance hardness
WANG Yao, SUN Guozi
Journal of Computer Applications 2021, 41(6): 1709-1714. DOI: 10.11772/j.issn.1001-9081.2020091378
To address the low detection efficiency of intrusion detection models caused by imbalanced network traffic data, a new Clustering and instance Hardness-based Oversampling method for intrusion detection (CHO) was proposed. Firstly, the hardness value of each minority class sample was measured as the proportion of majority class samples among its neighbors, and these values were used as input. Secondly, the Canopy clustering approach was used to pre-cluster the minority data, and the resulting number of clusters was taken as the clustering parameter of the K-means++ clustering approach, which clustered the data again. Then, the average hardness and the standard deviation of each cluster were calculated, the former was taken as the "investigation cost" in the optimum allocation theory of statistics, and the amount of data to be generated in each cluster was determined by this theory. Finally, the "safe" regions in each cluster were identified according to the hardness values, and the specified amount of data was generated in those safe regions by interpolation. Comparative experiments were carried out on 6 open intrusion detection datasets. The proposed method achieves the optimal value of 1.33 on both Area Under Curve (AUC) and Geometric mean (G-mean), and on 4 of the 6 datasets its AUC is 1.6 percentage points higher on average than that of the Synthetic Minority Oversampling TEchnique (SMOTE). The experimental results show that the proposed method is well suited to imbalance problems in intrusion detection.
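The three quantitative steps in this abstract (neighbor-based hardness, optimum allocation, interpolation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean distance, the default `k`, and the textbook allocation rule n_h ∝ N_h·S_h/√c_h are all choices made here, and the abstract does not say which quantity the per-cluster standard deviation is taken over.

```python
import math


def knn_hardness(minority, majority, k=5):
    """Hardness of each minority point: share of majority points among its k nearest neighbours."""
    labelled = [(p, 0) for p in minority] + [(p, 1) for p in majority]
    scores = []
    for i, point in enumerate(minority):
        # Distance to every other sample; index i is the point itself, so skip it.
        others = [
            (math.dist(point, q), is_majority)
            for j, (q, is_majority) in enumerate(labelled)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        scores.append(sum(flag for _, flag in nearest) / len(nearest))
    return scores


def optimum_allocation(total, sizes, stds, costs):
    """Textbook optimum allocation: n_h proportional to N_h * S_h / sqrt(c_h).

    Assumes every cost is positive. Rounded shares may not sum exactly to `total`.
    """
    weights = [n * s / math.sqrt(c) for n, s, c in zip(sizes, stds, costs)]
    norm = sum(weights)
    return [round(total * w / norm) for w in weights]


def interpolate(a, b, u):
    """Synthetic point at fraction u (0..1) along the segment from a to b."""
    return tuple(x + u * (y - x) for x, y in zip(a, b))


# Hypothetical toy data: three minority points, four majority points.
minority = [(0, 0), (0, 1), (5, 5)]
majority = [(5, 6), (6, 5), (6, 6), (5, 4)]
hardness = knn_hardness(minority, majority, k=2)

# Two hypothetical clusters of equal size and spread; the cheaper one gets more samples.
shares = optimum_allocation(100, sizes=[10, 10], stds=[1.0, 1.0], costs=[0.25, 1.0])
```

Under this rule, a cluster with a lower average hardness (a lower "cost") receives a larger share of the synthetic samples, which matches the abstract's emphasis on generating data in safe regions.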
Fake news content detection model based on feature aggregation
HE Hansen, SUN Guozi
Journal of Computer Applications 2020, 40(8): 2189-2193. DOI: 10.11772/j.issn.1001-9081.2019122114
To address the problem that classification models for fake news content detection cannot achieve good detection performance and good generalization performance at the same time, a model based on feature aggregation was proposed, namely CCNN (Center-Cluster-Neural-Network). Firstly, the global temporal features of the text were extracted by a Bi-directional Long Short-Term Memory (Bi-LSTM) recurrent neural network, and the word or phrase features within a window were extracted by a Convolutional Neural Network (CNN). Secondly, a feature aggregation layer trained with dual center loss was placed after the CNN pooling layer. Finally, the Bi-LSTM and CNN features were concatenated into a single vector along the depth direction and passed to the fully connected layer, and the final classification result was output by the model trained with a uniform loss function (uniform-sigmod). Experimental results show that the proposed model achieves an F1 value of 80.5%, with a difference of 1.3% between the training and validation sets. Compared with traditional models such as Support Vector Machine (SVM), Naïve Bayes (NB) and Random Forest (RF), the proposed model increases the F1 value by 9%-14%; compared with neural network models such as Long Short-Term Memory (LSTM) and FastText, it improves generalization performance by 1.3%-2.5%. The proposed model thus improves classification performance while maintaining a certain generalization ability, so the overall performance is enhanced.
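As a rough illustration of the building blocks named in this abstract, the sketch below shows feature concatenation, a center loss, and the F1 metric in plain Python. It rests on several assumptions: the loss shown is the commonly used single center loss (half the summed squared distance of each feature vector to its class center), not the paper's "dual" variant, whose definition the abstract does not give; the function names and toy values are hypothetical.

```python
def concat_features(bilstm_vec, cnn_vec):
    """Join the two feature vectors end to end before the fully connected layer."""
    return list(bilstm_vec) + list(cnn_vec)


def center_loss(features, labels, centers):
    """Half the summed squared distance between each feature vector and its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
joined = concat_features([0.1, 0.2], [0.3])
loss = center_loss(
    features=[[1.0, 2.0], [3.0, 4.0]],
    labels=[0, 1],
    centers={0: [1.0, 1.0], 1: [3.0, 2.0]},
)
f1 = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Minimizing a center loss pulls same-class features toward a shared center, which is one plausible reading of "feature aggregation"; the reported generalization figure corresponds to the gap between F1 computed this way on the training and validation sets.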